Katz's back-off model
Katz back-off is a generative ''n''-gram language model that estimates the conditional probability of a word given its history in the ''n''-gram. It accomplishes this estimation by "backing off" to models with smaller histories under certain conditions. By doing so, the model with the most reliable information about a given history is used to provide better results.
==The method==

The equation for Katz's back-off model is:〔Katz, S. M. (1987). "Estimation of probabilities from sparse data for the language model component of a speech recognizer". ''IEEE Transactions on Acoustics, Speech, and Signal Processing'', 35(3), 400–401.〕
:\begin{align}
& P_{bo}(w_i \mid w_{i-n+1} \cdots w_{i-1}) \\
&= \begin{cases}
d_{w_{i-n+1} \cdots w_{i}} \dfrac{C(w_{i-n+1} \cdots w_{i-1} w_{i})}{C(w_{i-n+1} \cdots w_{i-1})} & \text{if } C(w_{i-n+1} \cdots w_i) > k \\
\alpha_{w_{i-n+1} \cdots w_{i-1}} P_{bo}(w_i \mid w_{i-n+2} \cdots w_{i-1}) & \text{otherwise}
\end{cases}
\end{align}

where
: ''C''(''x'') = number of times ''x'' appears in training
: ''w''''i'' = ''i''th word in the given context
Essentially, this means that if the ''n''-gram has been seen more than ''k'' times in training, the conditional probability of a word given its history is proportional to the maximum likelihood estimate of that ''n''-gram. Otherwise, the conditional probability is equal to the back-off conditional probability of the "(''n'' − 1)-gram".
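For a concrete instance, consider a bigram model (''n'' = 2): the rule reduces to using the discounted relative frequency when the bigram has been seen, and a weighted unigram estimate otherwise. Below is a minimal Python sketch of that decision rule on invented counts; the fixed values for ''d'' and ''α'' are placeholders for the quantities derived in the rest of this section.

```python
from collections import Counter

# Minimal sketch of the top-level decision rule for a bigram model (n = 2).
# The training counts are invented, and the fixed d and alpha below are
# placeholder constants standing in for the Good-Turing discount and the
# back-off weight derived later in this section.

k = 0  # back-off threshold

unigrams = Counter(["the", "cat", "sat", "the", "dog"])
bigrams = Counter([("the", "cat"), ("cat", "sat"), ("the", "dog")])
total_words = sum(unigrams.values())

def p_katz_bigram(w_prev, w, d=0.9, alpha=0.4):
    """P_bo(w | w_prev): discounted MLE if the bigram was seen more than
    k times, otherwise the back-off weight times the unigram estimate."""
    if bigrams[(w_prev, w)] > k:
        return d * bigrams[(w_prev, w)] / unigrams[w_prev]
    return alpha * unigrams[w] / total_words

print(p_katz_bigram("the", "cat"))  # seen bigram: 0.9 * 1/2 = 0.45
print(p_katz_bigram("cat", "dog"))  # unseen bigram: 0.4 * 1/5 = 0.08
```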
The more difficult part is determining the values for ''k'', ''d'' and ''α''.
''k'' is the least important of the parameters. It is usually chosen to be 0. However, empirical testing may find better values for ''k''.
''d'' is typically the amount of discounting found by Good–Turing estimation. In other words, if Good–Turing estimation replaces the raw count C with an adjusted count C^*, then d = \frac{C^*}{C}.
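As an illustration, the sketch below computes this discount from a hypothetical count-of-counts table, using the standard Good–Turing adjusted count C^* = (C + 1) N_{C+1} / N_C, where N_r is the number of distinct ''n''-grams seen exactly ''r'' times; the table values are invented for the example.

```python
# Good-Turing discount d = C*/C, with C* = (C + 1) * N_{C+1} / N_C, where
# N_r is the number of distinct n-grams observed exactly r times. The
# count-of-counts table below is invented purely for illustration.
count_of_counts = {1: 500, 2: 200, 3: 90, 4: 40}

def good_turing_discount(c):
    """Return d = C*/C for a raw count c."""
    c_star = (c + 1) * count_of_counts[c + 1] / count_of_counts[c]
    return c_star / c

print(good_turing_discount(1))  # C* = 2 * 200/500 = 0.8, so d = 0.8
print(good_turing_discount(2))  # C* = 3 * 90/200 = 1.35, so d = 0.675
```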
To compute ''α'', it is useful to first define a quantity ''β'', which is the left-over probability mass for the (''n'' − 1)-gram:
:\beta_{w_{i-n+1} \cdots w_{i-1}} = 1 - \sum_{\{w_i : C(w_{i-n+1} \cdots w_i) > k\}} d_{w_{i-n+1} \cdots w_i} \dfrac{C(w_{i-n+1} \cdots w_{i-1} w_i)}{C(w_{i-n+1} \cdots w_{i-1})}
Then the back-off weight, ''α'', is computed as follows:
:\alpha_{w_{i-n+1} \cdots w_{i-1}} = \dfrac{\beta_{w_{i-n+1} \cdots w_{i-1}}}{\sum_{\{w_i : C(w_{i-n+1} \cdots w_i) \leq k\}} P_{bo}(w_i \mid w_{i-n+2} \cdots w_{i-1})}
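To make the normalisation concrete, the following sketch evaluates ''β'' and ''α'' for a single bigram context on invented counts, following the two formulas above; a single constant discount ''d'' stands in for the per-''n''-gram Good–Turing discounts.

```python
from collections import Counter

# Worked example of beta and alpha for one bigram context. All counts are
# invented, and a constant discount d replaces the per-n-gram Good-Turing
# discounts for brevity.
k, d = 0, 0.8
unigrams = Counter({"cat": 3, "sat": 2, "mat": 1, "dog": 2})
bigrams = Counter({("the", "cat"): 2, ("the", "dog"): 1})
total = sum(unigrams.values())
vocab = set(unigrams)

def beta(ctx):
    """Left-over probability mass after discounting every seen bigram."""
    ctx_count = sum(c for (w1, _), c in bigrams.items() if w1 == ctx)
    return 1.0 - sum(d * c / ctx_count
                     for (w1, _), c in bigrams.items()
                     if w1 == ctx and c > k)

def alpha(ctx):
    """Back-off weight: beta over the total unigram probability of the
    words whose bigram with ctx was not seen more than k times."""
    unseen = [w for w in vocab if bigrams[(ctx, w)] <= k]
    return beta(ctx) / sum(unigrams[w] / total for w in unseen)

print(beta("the"))   # 1 - 0.8*(2/3) - 0.8*(1/3) = 0.2
print(alpha("the"))  # 0.2 / ((2 + 1)/8) = 0.5333...
```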
The above formula only applies if there is data for the "(''n'' − 1)-gram". If not, the algorithm skips that order entirely and uses the Katz estimate for the (''n'' − 2)-gram, and so on, until an order with data is found.
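Putting the pieces together, here is a rough end-to-end sketch of the recursion on a toy corpus, including this order-skipping rule; as in the sketches above, a single fixed discount replaces the Good–Turing discounts, so the numbers are illustrative only.

```python
from collections import Counter

# Self-contained sketch of the full Katz recursion up to trigrams. When a
# history has no data at the current order, the code drops straight to the
# next shorter history; otherwise it either uses the discounted MLE or
# weights the shorter-history estimate by alpha.

corpus = "the cat sat on the mat the cat ran".split()
K, D = 0, 0.8  # back-off threshold and fixed stand-in discount

counts = Counter()
for n in (1, 2, 3):  # unigram, bigram, and trigram counts
    for i in range(len(corpus) - n + 1):
        counts[tuple(corpus[i:i + n])] += 1
total_words = len(corpus)
vocab = set(corpus)

def p_katz(word, history):
    """P_bo(word | history), recursing on ever-shorter histories."""
    history = tuple(history)
    if not history:                       # base case: plain unigram MLE
        return counts[(word,)] / total_words
    if counts[history] == 0:              # no data for this history:
        return p_katz(word, history[1:])  # skip this order entirely
    if counts[history + (word,)] > K:     # reliable n-gram: discounted MLE
        return D * counts[history + (word,)] / counts[history]
    # Unseen n-gram: compute beta and alpha for this history, then back off.
    seen = [w for w in vocab if counts[history + (w,)] > K]
    beta = 1.0 - sum(D * counts[history + (w,)] / counts[history]
                     for w in seen)
    denom = sum(p_katz(w, history[1:])
                for w in vocab if counts[history + (w,)] <= K)
    return (beta / denom) * p_katz(word, history[1:])

print(p_katz("cat", ("the",)))       # seen bigram: 0.8 * 2/3
print(p_katz("mat", ("on", "the")))  # seen trigram: 0.8 * 1/1
print(p_katz("sat", ("on", "the")))  # unseen trigram: backs off
```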

Excerpt source: Wikipedia, the free encyclopedia ("Katz's back-off model").


